Abstract

The quality of data plays an important role in business analysis and decision making, and data accuracy is an important
aspect of data quality. Thus one necessary task for data quality management is to evaluate the accuracy of the data. Because
the accuracy of a whole data set may be low while that of a useful part is high, it is also necessary
to evaluate the accuracy of query results, called relative accuracy. However, as far as we know, neither a measure nor
effective methods for such accuracy evaluation have been proposed. Motivated by this, we
propose a systematic method for relative accuracy evaluation. We design a relative accuracy evaluation framework for relational databases based on a new
metric that measures accuracy using statistics. We apply the methods to evaluate the precision and recall of basic queries,
which show the relative accuracy of the result. We also propose a method to handle data updates and to improve accuracy
evaluation using functional dependencies. Extensive experimental results show the effectiveness and efficiency of our
proposed framework and algorithms.






Citation: Zhang Y, Wang H, Yang Z, Li J (2014) Relative Accuracy Evaluation. PLoS ONE 9(8): e103853. doi:10.1371/journal.pone.0103853
Editor: Zhen Wang, Center of nonlinear, China 

Received April 4, 2014; Accepted July 2, 2014; Published August 18, 2014

Copyright: © 2014 Zhang et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits
unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. 

Data Availability: The authors confirm that all data underlying the findings are fully available without restriction. Data are generated by the generator TPCH. 

Funding: This paper was partially supported by NGFR 973 grant 2012CB316200 and NGFR 863 grant 2012AA011004. The funders had no role in study design,
data collection and analysis, decision to publish, or preparation of the manuscript. 

Competing interests: The authors have declared that no competing interests exist. 

* Email: wangzh@hit.edu.cn 



Introduction 

Data quality plays an important role in business
analysis and decision making [1-4], and has been studied in
different areas, such as statistics, management science, and
computer science [5]. Dirty data is a major cause of data
quality problems. Many surveys reveal that dirty data exists in most
database systems. For example, a survey [6] reports that over 65%
of the inventory records at retailer Gamma were inaccurate at the
store-SKU level. The consequences of dirty data may be severe.
Dirty data with uncertainty, duplication or inconsistency may
lead to ineffective marketing, operational inefficiencies, inferior
customer relationship management, and poor business decisions.
For example, it is reported [7] that dirty data in retail databases
alone costs US consumers $2.5 billion a year. Hence it is
important to estimate the quality of data before it is used.

Data quality has many aspects, including accuracy, consistency,
currency and completeness. Among them, accuracy is an
important one. Accuracy is defined as the degree of closeness
between the measurement of a value and the corresponding actual
(true) value. In many applications, inaccurate data will mislead
decisions. To ensure that data can be used safely, the accuracy of
the data should be estimated before it is used. Our preliminary work
studies accuracy evaluation on the whole data set [8], which is
called absolute accuracy.

It may happen that the accuracy of the whole data set is low but that
of the part containing the query results is high, so it is
necessary to evaluate the accuracy of the query result, which is called
relative accuracy. For example, consider a database that collects
sensor data. After some time, a sensor fails, so the
quality of the database becomes low. But if we query
only data with timestamps before the time at which the sensor
failed, the database can still return results with high quality. In
such cases, it is important to evaluate the accuracy of the query
and the query result.

Another example application that will benefit from our
method is metaknowledge [9-12]: large corpora of written text,
both scientific and literary, which are becoming increasingly
available in digitized form. Accuracy estimation could be used
to evaluate the quality of metaknowledge and further evaluate its
usability.

Despite its importance, the estimation of relative accuracy brings
the following technical challenges.

• The data may come from different data sources and in different
data models with different accuracy, including structured,
semi-structured and even unstructured data
models. The relative accuracy evaluation method should be
adaptable to all these models.

• Within the data set, different values may refer to the same
real-world entity, and we need to estimate the true value of the
entity attribute if the entity does not have an explicit one.

• There are many different types of data. For different types,
different estimation methods should be applied.

• There are many types of queries. Query analysis needs to be
performed, and the precision and recall of the query results need
to be evaluated.

Current work seldom considers the evaluation of accuracy with
different data types. Only our preliminary work [8] proposes an
evaluation method for absolute accuracy. Reference [13] considers accuracy
estimation. However, in that paper, only the category type is
considered, and a value in their system can
only be considered as a true value or a false value. In real
applications, there are many other data types. For example, in a
sensor network, suppose the true value is 1.0. For two sensors A and B,
the measured result of A is 2.0 and that of B is 1.5. Clearly, the
accuracy of B is better than that of A. Under a true-or-false model, the accuracies
of A and B cannot be distinguished even though they are different.

With true value estimation methods [14-16], the mean
squared error (MSE), which is the expected value of the squared
error loss (quadratic loss) and measures the average of the
squares of the "errors", can be used to estimate the accuracy
directly. However, the truth discovery methods are not related to
accuracy and are not suitable for accuracy estimation, and
different data types also call for different evaluation methods. In order
to unify the accuracy measurement metric of different data types,
we define a new accuracy metric, ARE (average relative error),
which is based on the mean value of the relative error of the data values.

To evaluate ARE, we propose a relative accuracy evaluation
framework for relational databases with different data types, which
could also be extended to other data models. This paper makes the
following contributions.

1) We propose a general accuracy evaluation framework mainly 
for relational database with different data types, which could 
also be extended to other data models. 

2) According to the differences in evaluation method for data in 
various types, we classify the data types into three classes. 

3) We propose efficient accuracy evaluation algorithms for the three
data types in two cases: in the presence and in the absence of true
values.

4) We design the strategy to handle data update and the method 
to use the functional dependency to improve accuracy 
evaluation. 

5) We propose the methods to evaluate the precision and recall 
of the basic query operations and to evaluate the overall 
accuracy of the query results, which will be combined to 
compute the relative accuracy of the query. 

In the following parts, we first introduce the framework of
relative accuracy evaluation. As our framework is based on the
accuracy of the attributes, we develop attribute accuracy
evaluation algorithms for each category in the presence
and in the absence of true values, and show how our framework works in
these situations. We also propose a strategy to handle data
updates and to use functional dependencies to improve accuracy
evaluation.

The rest of this paper is organized as follows. Section 2
proposes the basic framework of relative accuracy evaluation.
Section 3 and Section 4 discuss the evaluation algorithms in the
presence and absence of true values, respectively. Section 5 gives
the method to handle data updates and the strategy to improve
accuracy evaluation using functional dependencies. The experimental
results and analysis are shown in Section 6. Section 7
discusses the related work and Section 8 draws the conclusions.

Framework 

As we know, a relational database consists of relational tables, a
relational table consists of tuples, and a tuple consists of different
attributes. Therefore, we use the accuracy of attributes to evaluate
the accuracy of tuples, use the accuracy of tuples to evaluate the accuracy
of tables, and use the accuracy of tables to evaluate the accuracy of the
database. As a result, we reduce the problem to evaluating the
accuracy of the attributes [8]. This strategy could also be extended
to other data models: the evaluation of the accuracy of a data
object can be a combination of the evaluations of the accuracy of its
attributes.

Using attributes as the basic unit of evaluation does not mean
neglecting the relationships between the attributes. We note
that latent relationships among the attributes will affect the
accuracy evaluation. We call such a relationship an entity relationship: it means
that different attribute values may describe the same attribute of a
real-world entity. With entity relationships, during the evaluation,
some attributes with different values may share the same true
value. We use this property as a basis to compute the
accuracy of the measured data, since if all the measured data are
independent, we could not compute the error distribution without
enough prior knowledge. Other attribute types are similar.

Following the above discussion, our evaluation methods take
attributes as basic units and consider the relationships among them.
In this section, we give an overview of the evaluation
framework. First, we show the framework of relative
accuracy evaluation; then the attributes are classified
according to the different accuracy evaluation methods, which
is the first step of the framework; finally, we describe
the methods to compute the rough accuracy of the basic query
operations. Such accuracy can be used to define and derive the accuracy of
other operations, and it gives users a first impression
of the query.

2.1 Accuracy Evaluation Framework 

The framework of relative accuracy evaluation includes four
phases.

1. The types of attributes are classified according to the evaluation
methods of attributes.

2. The accuracy for each type of attribute is evaluated.

3. The rough accuracy of the query is computed, and users
decide whether the query is suitable to be executed.

4. The precision, recall and F-measure of the query and the absolute
accuracy of the query results are computed, which are
combined to show the relative accuracy of the query.

The first phase is performed according to data format and data
semantics [8]. For example, numerical values, including integers
and floating-point numbers, obviously belong to the
measurable type; string data and set data belong to the
comparable type; and gender and level data belong to the
category type.
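The classification phase can be illustrated with a minimal sketch. The function name `classify_attribute` and the `category_hint` flag are hypothetical; the flag stands in for the semantic knowledge (for example, that a column stores gender or level) that cannot be read off the data format alone.

```python
from numbers import Real


def classify_attribute(values, category_hint=False):
    """Return 'measurable', 'comparable' or 'category' for one attribute.

    category_hint is an assumed, caller-supplied flag: semantics such as
    "this column is a gender or a level" are not visible in the format.
    """
    values = list(values)
    if not values:
        raise ValueError("cannot classify an empty attribute")
    if category_hint:
        return "category"
    # Integers and floating-point numbers are treated as measurable.
    if all(isinstance(v, Real) and not isinstance(v, bool) for v in values):
        return "measurable"
    # Strings and sets are treated as comparable.
    if all(isinstance(v, (str, set, frozenset)) for v in values):
        return "comparable"
    raise ValueError("mixed or unsupported value types")


print(classify_attribute([21.5, 22, 19.8]))
print(classify_attribute(["Alice", "Bob"]))
print(classify_attribute(["F", "M"], category_hint=True))
```

The hint is checked first because category values are often stored as strings and would otherwise be classified as comparable.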

In the second phase, we use statistics theory to compute the
accuracy of each attribute. As different data types have different
dimensions, in order to unify the accuracy measurement metric of
different data types, we define a new accuracy metric which uses
the mean value of the relative error of the data values to represent the accuracy
of the data. We use it as the accuracy measure for values in the same
attribute. The details of this phase are described in Section 3
and Section 4 for the cases of presence and absence of true values,
respectively.

In the third phase, we first give the rough accuracy of the query
using the accuracy of attributes, based on probability analysis.
This step gives users a first impression of the query. It
is performed offline, so it is very efficient, though it
may not be very accurate.

In the fourth phase, we compute the precision, recall and
F-measure of the query and the absolute accuracy of the query
results. The precision of a source s is the probability of its positive
claims being correct; the sensitivity or recall of a source s is the
probability of true facts being claimed as true. A measure that
combines precision and recall is their harmonic mean;
the traditional F-measure is as follows.



$$F_\beta = \frac{(1 + \beta^2) \times \text{precision} \times \text{recall}}{\beta^2 \times \text{precision} + \text{recall}} \tag{1}$$

We use $\beta$ to describe the relative
importance between recall and precision. A special case is $\beta = 1$,
where recall and precision are evenly weighted. For the absolute
accuracy of a data set, we use the average of the ARE of the different types
of attributes to represent it. It is denoted as follows.



$$ARE = \frac{\sum_{t \in T} (1 - accuracy_t)}{|T|} \tag{2}$$



where T is the set of attributes, and $accuracy_t$ is the accuracy of
attribute t. Since the quality of the query is related not only to the
accuracy of the query attributes but also to the global accuracy of the
results, in order to obtain the relative accuracy of the query, we
need to consider both of them. Therefore, we use the following
quadruple to represent the relative accuracy of the query results.

$$r\_accuracy = (Precision, Recall, F\text{-}measure, ARE) \tag{3}$$

where Precision, Recall and F-measure are the precision, recall 
and F-measure of the result, and ARE is the accuracy of the result 
set. 
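As a minimal sketch of how the F-measure and the quadruple of Eq. (3) fit together, assuming precision, recall and ARE have already been obtained elsewhere (the names `RelativeAccuracy`, `f_measure` and `relative_accuracy` are illustrative, not from the paper):

```python
from typing import NamedTuple


class RelativeAccuracy(NamedTuple):
    """The quadruple (Precision, Recall, F-measure, ARE)."""
    precision: float
    recall: float
    f_measure: float
    are: float


def f_measure(precision, recall, beta=1.0):
    """F_beta = (1 + beta^2) * P * R / (beta^2 * P + R)."""
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Assumption of this sketch: define F as 0 when P = R = 0.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


def relative_accuracy(precision, recall, are, beta=1.0):
    """Bundle the four components into one record."""
    return RelativeAccuracy(precision, recall,
                            f_measure(precision, recall, beta), are)


print(relative_accuracy(0.8, 0.5, 0.9))
```

With beta = 1 the F-measure reduces to the harmonic mean of precision and recall; a larger beta weights recall more heavily.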

2.2 Absolute Accuracy Evaluation

Since the attributes may be in various categories, although the 
semantics of accuracy on them are the same, the accuracy 
computation methods of them are different. According to their 
difference, the attributes are classified into three types [8]. 

• Measurable Attribute: The values of such an attribute are
continuous and can be modeled by some distribution. Such
attributes include values from measuring instruments,
such as temperature and humidity.

• Comparable Attribute: The values of such an attribute are not
continuous and no distribution can be derived from the values.
However, the difference between such values can be computed. That
is, the distance between the input value and the true value can be
computed. For example, both a name attribute and a set
attribute belong to this type.

• Category Attribute: The difference between two values of
such an attribute cannot be computed. The difference can
only be expressed as a rank instead of a concrete value. Examples
are the gender attribute and the rank attribute.

As different data types have different dimensions, both the accuracy
metric and the accuracy evaluation method for a
given data set are defined according to the data type. We will introduce the
metrics and evaluation methods in Section 3 and Section 4.

2.3 Query analysis and the Probability Calculation 

The quality of the query results is related to the accuracy of the
query attributes and the overall accuracy of the results. In order to
obtain the relative accuracy of the query, we need to consider both
of them. We will first introduce the query analysis and its rough
accuracy calculation approaches.

The operations of queries are varied, such as selection,
projection, join, division, union, difference, intersection and
Cartesian product. They can be defined and derived from five basic
operations: selection, projection, union, difference and Cartesian
product. The following is the analysis and rough accuracy
evaluation of the five basic operations.

2.3.1 Selection. The selection is also known as the restriction.
It selects the tuples from the database that satisfy the given
conditions, denoted as $\sigma_F(R) = \{t \mid t \in R \wedge F(t) = \text{'true'}\}$, where F
represents the selection criteria. F is a logical expression, which
takes a logical value of true or false. The basic form of F is $X \theta Y$,
where $\theta$ represents a comparison operator, which can be >, ≥, <,
≤, = or <>, and X or Y may represent an attribute name, a
constant or a simple function. We can further carry out logic
operations on these basic selection criteria, such as not ($\neg$), and ($\wedge$),
or ($\vee$). The probability calculation is based on the accuracy of the
attributes. If X or Y is a constant, then we use only the
accuracy of the attribute involved to represent the query
accuracy; if both X and Y are attributes, we use the product of their
accuracies to represent the query probability, that is,
$P_{F(t)} = P_X \times P_Y$. The corresponding probability formula for $\neg$ is
$P_{\neg A} = 1 - P_A$; for $\wedge$, it is $P_{A \wedge B} = P_A \times P_B$; for $\vee$, it is
$P_{A \vee B} = P_A + P_B - P_A \times P_B$.
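The selection rules above compose mechanically. The following sketch applies them to a compound condition; the helper names and the attribute accuracies (0.9, 0.8, 0.5) are made up for illustration.

```python
def p_compare(p_x, p_y=None):
    """Rough probability of a basic criterion X theta Y.

    Pass only p_x when the other operand is a constant; pass both
    accuracies when X and Y are attributes.
    """
    return p_x if p_y is None else p_x * p_y


def p_not(p_a):
    return 1.0 - p_a


def p_and(p_a, p_b):
    return p_a * p_b


def p_or(p_a, p_b):
    return p_a + p_b - p_a * p_b


# Hypothetical condition: (price > 100) AND NOT (qty = stock),
# with accuracy(price) = 0.9, accuracy(qty) = 0.8, accuracy(stock) = 0.5.
rough = p_and(p_compare(0.9), p_not(p_compare(0.8, 0.5)))
print(round(rough, 4))
```

Because every combinator takes and returns a probability, arbitrarily nested criteria can be evaluated offline from the attribute accuracies alone.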

2.3.2 Projection. The projection on the relation R selects
some particular attributes from R to form a new relation. It is
denoted as $\pi_A(R) = \{t[A] \mid t \in R\}$, where A is a set of attributes from
R. We need not compute a selection probability for it, as it selects entire
columns. We can use the mean value of the accuracy of the selected
attributes to represent the rough projection accuracy. For
example, if the selected attributes are A and B, then the rough
projection accuracy = (accuracy(A) + accuracy(B))/2.

2.3.3 Union. The union of relation R and relation S is
denoted as $R \cup S = \{t \mid t \in R \vee t \in S\}$, where R and S share the same
attributes. As the union only removes the duplicate tuples
belonging to both relations, we use the formula $P_{R \cup S} = 1 -
P_R \times P_S$ to represent the rough probability of union.

2.3.4 Difference. The difference of relation R and
relation S is denoted as $R - S = \{t \mid t \in R \wedge t \notin S\}$, where R and S
share the same attributes. In the difference, the data set R
only loses the tuples that also belong to the second set. We use the
formula $P_{R - S} = P_R \times (1 - P_S)$ to represent the rough probability of
difference.

2.3.5 Cartesian Product. The Cartesian product considered
here is the extended Cartesian product, since the unit is the
tuple. The Cartesian product of relation R with m attributes and
relation S with n attributes is a relation containing m + n
attributes. It is denoted as $R \times S = \{t_r t_s \mid t_r \in R \wedge t_s \in S\}$, and it is
generally not used directly, but as the basis of join and other
operations. We use the formula $P_{R \times S} = P_R \times P_S$ to represent the
rough probability of the Cartesian product. However, if the Cartesian
product only uses a portion of the relations, as in equijoins and
natural joins, we use only the product of the accuracies of
the attributes involved to represent the rough probability of the
Cartesian product.

Example 2.1 The join can be composed from selection and
Cartesian product, and is also called the $\theta$ join, which selects the
tuples that satisfy certain conditions from the Cartesian product of two
relations. It is denoted as $Join_{A \theta B}(R,S) = \{t_r t_s \mid t_r \in R \wedge t_s \in S \wedge
t_r[A] \, \theta \, t_s[B]\}$, where A and B represent comparable attributes
from R and S and $\theta$ is the comparison operator. The two most
important and most common joins are the equijoin and the natural
join. The $\theta$ of the equijoin is "=", which selects the tuples
that have equal values on attributes A and B from the
Cartesian product of the two relations, denoted as $Join_{A = B}(R,S) =
\{t_r t_s \mid t_r \in R \wedge t_s \in S \wedge t_r[A] = t_s[B]\}$; the natural join is a special
equijoin, which requires not only equal attribute values but also
the same attribute type, denoted as $Join(R,S) = \{t_r t_s \mid t_r \in R \wedge
t_s \in S \wedge t_r[B] = t_s[B]\}$. We can use the formula $P_{Join_{\theta(AB)}}(R,S) =
P_A \times P_B \times P_R \times P_S$ to represent its rough query result probability.
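The rough probabilities of Sections 2.3.2 to 2.3.5 and of the join in Example 2.1 are simple closed forms. The sketch below transcribes them directly as stated in the text; the function names and the sample accuracies are hypothetical.

```python
def p_projection(attribute_accuracies):
    """Mean accuracy of the projected attributes."""
    accs = list(attribute_accuracies)
    return sum(accs) / len(accs)


def p_union(p_r, p_s):
    """Rough probability of R union S, as given in Section 2.3.3."""
    return 1 - p_r * p_s


def p_difference(p_r, p_s):
    """Rough probability of R - S."""
    return p_r * (1 - p_s)


def p_cartesian(p_r, p_s):
    """Rough probability of the Cartesian product R x S."""
    return p_r * p_s


def p_theta_join(p_a, p_b, p_r, p_s):
    """Rough probability of a theta join on attributes A and B."""
    return p_a * p_b * p_r * p_s


# Made-up accuracies for two relations and their join attributes.
print(p_projection([0.9, 0.7]))
print(p_theta_join(0.9, 0.8, 0.95, 0.9))
```

All inputs are accuracies in [0, 1], so these values can be computed before the query is executed.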

In this paper, to simplify the discussion, we assume that all data objects
and types share the same importance. The accuracy in the case that data
objects or types have different importance could be evaluated by
adding weights to each item in the accuracy evaluation formula.

In Sections 3 and 4, we will propose attribute accuracy
evaluation algorithms for the data types in each category, in the
presence and in the absence of true values, and show how our
framework works in these situations.

Accuracy Estimation with the True Values 

Accuracy is defined as the degree of closeness between the
measurements of a quantity and the actual (true) value of that
quantity. As different data types have different dimensions, we need a
metric to measure the accuracy of different data types. We first
propose a new metric to uniformly describe the accuracy of different
data types and then describe how to evaluate the accuracy in the
ideal situation in which the attributes have true values.

In statistics theory, the mean squared error (MSE) is often used to
estimate the accuracy of observations. However, different data
types have different dimensions. In order to unify the accuracy
evaluation metric of different data types, we define a new standard,
ARE (average relative error), which uses the mean value of the
relative error of the data values to represent the accuracy of the data. The relative
error of a parameter $\theta$ is denoted as $RE(\hat{\theta}) = |\hat{\theta} - \theta| / |\theta|$, where $\theta$ is
the true value of the parameter and $\hat{\theta}$ is the observation of $\theta$.
The ARE of an attribute is denoted as follows.

$$ARE(D) = 1 - \frac{\sum_{v \in D} RE(v)}{|D|} \tag{4}$$

where D is the set of the attribute values, v is a value which belongs
to D and RE(v) is the relative error of v. In the remaining part of
this paper, we also use D to denote the set of attribute values.

In the presence of true values, the computation of ARE looks trivial.
However, for different data types, the computation of ARE is
different. We will discuss the evaluation methods for the different data
types with true values, respectively.

In this section, the evaluation methods involve true values. In
order to distinguish true values from the values of attributes in the
data set, which possibly contain inaccurate or even false values, in
the remaining part of this paper we use observations to refer to the
values of attributes in the data set.

Measurable Attribute: For measurable attributes, the
accuracy for a set of observations is computed as follows.

$$ARE(D) = 1 - \frac{1}{|D|} \sum_{v \in D} \frac{|v - t_v|}{|t_v|} \tag{5}$$

where $t_v$ is the true value of v. With true values, the ARE is
computed as one minus the average relative error between the
observations and their true values.

Comparable Attribute: For comparable attributes, we
define the distance function first, and the accuracy of the
comparable type is computed as follows.

$$ARE(D) = 1 - \frac{1}{|D|} \sum_{v \in D} \frac{Distance(v, t_v)}{|t_v|} \tag{6}$$

where $t_v$ is the true value corresponding to the observation v and $|t_v|$ is the
length of $t_v$. Distance is a distance function defined on the
comparable data type; for example, it can be the edit distance for
string data, or the Jaccard distance for set data.
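The two cases above can be sketched as follows, assuming each observation comes paired with its known true value and that the edit distance is the chosen Distance for strings. The function names are illustrative, and the measurable case assumes non-zero true values.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def are_measurable(pairs):
    """pairs: iterable of (observation, true_value), true_value != 0."""
    pairs = list(pairs)
    return 1 - sum(abs(v - t) / abs(t) for v, t in pairs) / len(pairs)


def are_comparable(pairs, distance=edit_distance):
    """pairs: iterable of (observation, true_value) for comparable data."""
    pairs = list(pairs)
    return 1 - sum(distance(v, t) / len(t) for v, t in pairs) / len(pairs)


# Sensor example from the Introduction: true value 1.0, A reads 2.0, B reads 1.5.
print(are_measurable([(2.0, 1.0)]), are_measurable([(1.5, 1.0)]))
print(are_comparable([("smith", "smith"), ("smyth", "smith")]))
```

On the sensor example the metric separates the two sensors (B scores higher than A), which a true-or-false model cannot do.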

Category Attribute: For category attributes, the difference
between values cannot be computed as before. Thus, the ARE is
computed as the expectation of the observation equaling the
true value. It is denoted as

$$ARE(D) = 1 - \frac{\sum_{v \in D} diff(t_v, v)}{|D|} \tag{7}$$

where the function diff() is computed as follows: if $t_v = v$, it returns
0; otherwise, it is computed according to the rank of the difference
between $t_v$ and v. To compute diff(), we model the values in a
category attribute as a graph G = (V, E), where V is the set of all
values and each $(u,v) \in E$ represents that v is the most similar to u
among all values in V. Then diff(u,v) is defined as the length of the
shortest path between u and v in G.
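A shortest-path diff() on such a similarity graph can be sketched with breadth-first search. The adjacency list `LEVELS` is a made-up example, and treating the graph as undirected (each edge listed in both directions) is an assumption of this sketch.

```python
from collections import deque


def diff(graph, u, v):
    """Length of the shortest path from u to v in the similarity graph."""
    if u == v:
        return 0
    seen = {u}
    queue = deque([(u, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt == v:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    raise ValueError(f"no path between {u!r} and {v!r}")


def are_category(pairs, graph):
    """pairs: iterable of (observation, true_value); implements Eq. (7)."""
    pairs = list(pairs)
    return 1 - sum(diff(graph, t, v) for v, t in pairs) / len(pairs)


# Hypothetical 'level' attribute: low - medium - high.
LEVELS = {"low": ["medium"], "medium": ["low", "high"], "high": ["medium"]}
sample = [("low", "low"), ("medium", "medium"),
          ("high", "high"), ("medium", "high")]
print(are_category(sample, LEVELS))
```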

Precision and recall: The precision and recall of a query can
be computed according to the definition. We denote the case that
the observation is true and the fact is true as $TP_s$, that the
observation is false but the fact is true as $FN_s$, that the observation
is true but the fact is false as $FP_s$, and the case that the observation
is false and the fact is false as $TN_s$. The precision of a query is
denoted as $precision = TP_s/(TP_s + FP_s)$, and the recall is denoted
as $recall = TP_s/(TP_s + FN_s)$. With true values, they are easy to
compute.
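A minimal sketch of these definitions, assuming the query result over the observed data and the result over the true data are both available as collections of tuple identifiers (the identifiers below are made up):

```python
def precision_recall(observed_result, true_result):
    """Precision and recall of a query result against the true result."""
    observed, truth = set(observed_result), set(true_result)
    tp = len(observed & truth)   # returned and correct
    fp = len(observed - truth)   # returned but not in the true result
    fn = len(truth - observed)   # in the true result but not returned
    # Assumption of this sketch: report 0.0 when a ratio is undefined.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


p, r = precision_recall({1, 2, 3, 4}, {2, 3, 5})
print(p, r)
```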

Accuracy Estimation without True Values 

In many cases, the true value of an attribute is unknown. In this
case, the accuracy computation is more difficult and the true
values need to be estimated from the existing observations. Based on
the observations, we estimate the accuracy without true values for
the different data types.

4.1 Measurable Data Type 

For the measurable data type, we note that if all the data are
independent of each other, it is impossible to compute the true
values and get the accuracy of the data without enough prior
knowledge. Since we often cannot obtain enough prior
knowledge and many tuples may describe the same entity, we
use entity resolution technology [17] to find the tuples which
describe the same entity. Then we obtain a series of measurable
data which share the same true value. We first compute the ARE
of every entity, and then use them to compute the ARE of the
whole data set.

Generally, for a given sample size, the metric that evaluates
the quality of a point estimate is a distance function which
measures the distance between the point estimate $\hat{\theta}$ and the
true parameter value $\theta$. The most commonly used function is the
square of the distance, and because of the randomness, we
compute the expectation of the function. The mean squared error
$MSE(\hat{\theta}) = E(\hat{\theta} - \theta)^2$ is the most general metric of point estimation.
Naturally, we wish the MSE to be as small as possible.

Notice that

$$MSE(\hat{\theta}) = E(\hat{\theta} - \theta)^2 = E[(\hat{\theta} - E\hat{\theta}) + (E\hat{\theta} - \theta)]^2 \tag{8}$$

$$= E(\hat{\theta} - E\hat{\theta})^2 + (E\hat{\theta} - \theta)^2 + 2E[(\hat{\theta} - E\hat{\theta})(E\hat{\theta} - \theta)] \tag{9}$$

$$= Var(\hat{\theta}) + (E\hat{\theta} - \theta)^2 \tag{10}$$

The cross term in Eq. (9) vanishes because $E\hat{\theta} - \theta$ is a constant and
$E(\hat{\theta} - E\hat{\theta}) = 0$. As we can see, the MSE is composed of two parts: the
variance of the point estimate and the square of its bias. For
a given sample size, the variance of $\hat{\theta}$ is fixed. As
long as $\hat{\theta}$ is an unbiased estimate of $\theta$, the MSE of $\hat{\theta}$ is minimized.
As we know, for a series of observations, the mean value is an
unbiased estimate of the true value, so we can use the mean value
to represent the true value to minimize the MSE.
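The decomposition in Eqs. (8) to (10) can be checked numerically. The sketch below treats a finite list of made-up estimates as the whole population, so that expectations become plain averages and the identity holds up to rounding.

```python
from statistics import fmean, pvariance


def mse(estimates, theta):
    """Average squared error of the estimates around the true value theta."""
    return fmean([(x - theta) ** 2 for x in estimates])


def variance_and_squared_bias(estimates, theta):
    """The two terms on the right-hand side of Eq. (10)."""
    variance = pvariance(estimates)          # population variance
    bias = fmean(estimates) - theta
    return variance, bias ** 2


estimates = [0.8, 1.1, 1.4, 0.9]   # hypothetical estimates of theta = 1.0
variance, bias_sq = variance_and_squared_bias(estimates, 1.0)
print(mse(estimates, 1.0), variance + bias_sq)
```

The two printed numbers agree, and the bias term is the only part that an unbiased choice of estimator removes.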

Under the case of minimizing the MSE, we use the mean value 
to compute the ARE of each entity. As a result, we can get the 
following formula. 



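$$ARE(E_i) = 1 - \frac{1}{|E_i|} \sum_{x \in E_i} \frac{|x - \bar{x}|}{|\bar{x}|} \tag{11}$$

where $\bar{x}$ is the average value of all x in $E_i$.

$$ARE(D) = \frac{\sum_{E_i \in D} |E_i| \times ARE(E_i)}{|D|} \tag{12}$$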
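The per-entity and global computations for measurable data can be sketched as below. Entity resolution is replaced here by an entity identifier that is assumed to be already attached to each record, and the sensor readings are made up; entity means are assumed to be non-zero.

```python
from collections import defaultdict


def are_entity(values):
    """ARE of one entity, using the mean as the estimated true value."""
    mean = sum(values) / len(values)
    return 1 - sum(abs(x - mean) / abs(mean) for x in values) / len(values)


def are_dataset(records):
    """records: iterable of (entity_id, value); size-weighted average."""
    groups = defaultdict(list)
    for entity_id, value in records:      # one hashing pass over the data
        groups[entity_id].append(value)
    total = sum(len(g) for g in groups.values())
    return sum(len(g) * are_entity(g) for g in groups.values()) / total


records = [("s1", 9.0), ("s1", 11.0), ("s2", 5.0)]
print(are_dataset(records))
```

Grouping by a hashable key takes a single pass, which is consistent with the linear-time behavior described for the hashing-based implementation.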



4.2 Comparable Type 

Unlike the measurable type, it is hard to find the true value for the
comparable type. As it is also hard to define the mean value of the
attribute of an entity, we define a new measure to find the most
likely true values as follows.

$$Distance(T_i, o_i) \le \varepsilon_i, \quad o_i \in O \tag{13}$$

where the function Distance measures the distance between the
observation $o_i$ and its true value $T_i$, and $\varepsilon_i$ is a variable
representing the relative error for observation $o_i$. Distance can
be the edit distance for strings or the Jaccard distance for sets.

As far as we can see, the most likely true value of an attribute of an
entity is one of the values which describe that entity, although
with very small probability it may not appear among them. If it does
not appear, it is almost impossible to obtain the true value without
enough prior knowledge. We usually do not have
enough prior knowledge in the real world, so we choose the true
value from the observed values. We may fail to get the
true value in this way, but that is an event of very small probability.

We denote the different observations as $O = \{o_1, o_2, \ldots, o_n\}$ and the
true value as t. We use the following metric to choose the possible true
value from the observed values.



$$\min F(O) = \frac{1}{|O|} \sum_{i=1}^{n} Distance^2(T, o_i) \tag{14}$$

where T is the selected true value for all observations $o_i$.

By enumerating every unique observation value, we obtain
the most likely true value t, which minimizes F(O). Though the
value t is a biased estimate of the true value, it minimizes
the distance function.

Using the estimated value t, we get the following formulas:

$$ARE(E_i) = 1 - \frac{\sum_{o \in E_i} Distance(o, t) / |t|}{|E_i|} \tag{15}$$

$$ARE(D) = \frac{\sum_{E_i \in D} |E_i| \times ARE(E_i)}{|D|} \tag{16}$$



Our evaluation method can also achieve O(n) time complexity
with entity resolution technology that uses the hash
method.
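For one entity, the selection of t by Eq. (14) and the resulting ARE of Eq. (15) can be sketched as follows. The edit distance is assumed as the Distance function, the observations are made up, and ties are broken by first occurrence, which is a choice of this sketch rather than something the text specifies.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                           prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def estimate_truth(observations, distance=edit_distance):
    """Unique observation minimizing the mean squared distance F(O)."""
    n = len(observations)
    candidates = dict.fromkeys(observations)   # unique, order preserved
    return min(candidates,
               key=lambda c: sum(distance(c, o) ** 2 for o in observations) / n)


def are_entity(observations, distance=edit_distance):
    """ARE of one entity, using the estimated truth t."""
    t = estimate_truth(observations, distance)
    total = sum(distance(o, t) / len(t) for o in observations)
    return 1 - total / len(observations)


names = ["smith", "smith", "smyth", "smit"]
print(estimate_truth(names), are_entity(names))
```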

4.3 Category Type 

4.3.1 Model. For category types, we also utilize the entity
resolution technology. We denote each entity as e and the set of entities
as E. We assume that tuples belonging to the same entity share the
same model. We denote the possible true values of the entity as
T. For a category attribute, the only information about the true
value comes from the observations, which means that without external
knowledge, the true value should be one of the observations. The
parameters of the model are defined as $\theta = \{\mu; r\}$, where $\mu_t$
represents the probability that the true value is t, and r represents the error
transition matrix, which is a $|T| \times |T|$ matrix whose element $r_{t_1,t_2}$
represents the probability that the observed value is $t_2$ when the
true value is $t_1$. The accuracy of one entity is defined as
$ARE(e) = 1 - \sum_{t_1 \in T} \mu_{t_1} \sum_{t_2 \in T, t_1 \neq t_2} r_{t_1,t_2} \, diff(t_1,t_2)$. Therefore, we
compute the global accuracy as $ARE(D) = \sum_{E_i \in D} |E_i| \, ARE(E_i) / |D|$.
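Given values for the parameters, the entity accuracy defined above is a double sum. The sketch below evaluates it with illustrative parameter values and with diff() assumed to be 1 for any two distinct values; both are assumptions made only for this example.

```python
def are_entity(mu, r, diff):
    """ARE(e) = 1 - sum_t1 mu[t1] * sum_{t2 != t1} r[t1][t2] * diff(t1, t2)."""
    penalty = sum(
        mu[t1] * sum(r[t1][t2] * diff(t1, t2) for t2 in mu if t2 != t1)
        for t1 in mu
    )
    return 1 - penalty


def are_dataset(entities):
    """entities: iterable of (entity_size, entity_are); weighted average."""
    entities = list(entities)
    total = sum(size for size, _ in entities)
    return sum(size * are for size, are in entities) / total


# Illustrative parameters over the values A, B and C.
mu = {"A": 0.5, "B": 0.25, "C": 0.25}
r = {"A": {"A": 0.6, "B": 0.2, "C": 0.2},
     "B": {"A": 0.3, "B": 0.4, "C": 0.3},
     "C": {"A": 0.4, "B": 0.2, "C": 0.4}}
unit_diff = lambda t1, t2: 1   # assumption: every mismatch costs 1

print(are_entity(mu, r, unit_diff))
```

With a unit diff() the penalty is the total probability of observing a value different from the true one.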

4.3.2 Solutions. Based on the model, we use the EM
algorithm [20] to estimate the parameters of the model. The
observable variable of the model is denoted as O, the latent variable
as T, and the parameter as $\theta$. The likelihood function
of the observable variable is $P(O|\theta) =
\sum_T P(O,T|\theta) = \sum_T P(T|\theta) P(O|T,\theta)$. The goal is to compute
the maximum likelihood estimate of $\theta$.

Now, we design the EM algorithm to solve this problem. At first, $\theta$
is initialized in this way: each $\mu_t$ is initialized by choosing a random
value from the range (0,1), making sure that $\sum_{t \in T} \mu_t = 1$;
each $r_{t_1,t_2}$ is initialized by choosing a random value from the range
(0,1), making sure that $\sum_{t_2 \in T} r_{t_1,t_2} = 1$. $diff(t,o)$ is
defined according to Section 3.

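At the E step, $\theta^{(i)}$ denotes the value of the estimate of $\theta$ at the ith
iteration. In this step, the following formula needs to be computed.

$$Q(\theta, \theta^{(i)}) = E_T[\log P(T,O|\theta) \mid O, \theta^{(i)}] \tag{17}$$

$$= \sum_T P(T|O,\theta^{(i)}) \log P(T,O|\theta) \tag{18}$$

For a specific true value t, $P(T,O|\theta) = \mu_t \prod_{o \in O} r_{t,o}$.

$$P(T|O,\theta^{(i)}) = \frac{P(T|\theta^{(i)}) P(O|T,\theta^{(i)})}{P(O|\theta^{(i)})} = \frac{\mu_t^{(i)} \prod_{o \in O} r_{t,o}^{(i)}}{\sum_{t' \in T} \mu_{t'}^{(i)} \prod_{o \in O} r_{t',o}^{(i)}} \tag{19}$$

Because the denominator $\sum_{t' \in T} \mu_{t'}^{(i)} \prod_{o \in O} r_{t',o}^{(i)}$ is the same
for every specific true value t, it can be seen as a constant and
neglected, as our goal is the value of $\theta$ that maximizes $Q(\theta, \theta^{(i)})$. Finally, we get

$$Q(\theta, \theta^{(i)}) = \sum_{t \in T} R_t^{(i)} \log \mu_t + \sum_{t \in T} \sum_{o \in O} R_t^{(i)} \log r_{t,o} \tag{20}$$

where $R_t^{(i)} = \mu_t^{(i)} \prod_{o \in O} r_{t,o}^{(i)}$.

In the M step, the estimate $\theta^{(i+1)}$ for the (i+1)th iteration is
computed as the $\theta$ that maximizes $Q(\theta, \theta^{(i)})$. Then the E step and M step
are repeated until convergence.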

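With the conditions $\sum_{t \in T} \mu_t = 1$ and $\forall t_1 \in T, \sum_{t_2 \in T} r_{t_1,t_2} = 1$,
the problem of computing the optimal $\theta$ is converted to the
following optimization problem.

$$\max f(\theta) = \sum_{t \in T} R_t^{(i)} \log \mu_t + \sum_{t \in T} \sum_{o \in O} R_t^{(i)} \log r_{t,o} \tag{21}$$

Subject to

$$\sum_{t \in T} \mu_t = 1, \qquad \forall t_1 \in T, \ \sum_{t_2 \in T} r_{t_1,t_2} = 1$$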



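It is supposed that $T = \{t_1, t_2, \ldots, t_k\}$. Using Lagrange duality and
Lagrange multipliers, we get the Lagrange function as follows.

$$L(\theta, \lambda, \alpha) = -\sum_{t \in T} R_t^{(i)} \log \mu_t - \sum_{t \in T} \sum_{o \in O} R_t^{(i)} \log r_{t,o} + \lambda \Big( \sum_{t \in T} \mu_t - 1 \Big) + \sum_{t_1 \in T} \alpha_{t_1} \Big( \sum_{t_2 \in T} r_{t_1,t_2} - 1 \Big) \tag{22}$$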



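Setting the gradient $\nabla_{\theta,\lambda,\alpha} L(\theta, \lambda, \alpha) = 0$ yields the system of
equations as follows, where $eqn(t,o)$ is taken to be 1 when the observation o equals t and 0 otherwise.

$$\frac{\partial L}{\partial \mu_t} = -\frac{R_t^{(i)}}{\mu_t} + \lambda = 0, \qquad \frac{\partial L}{\partial r_{t_1,t_2}} = -\frac{R_{t_1}^{(i)} \sum_{o \in O} eqn(t_2, o)}{r_{t_1,t_2}} + \alpha_{t_1} = 0,$$

$$\frac{\partial L}{\partial \lambda} = \sum_{t \in T} \mu_t - 1 = 0, \qquad \frac{\partial L}{\partial \alpha_{t_1}} = \sum_{t \in T} r_{t_1,t} - 1 = 0 \tag{23}$$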

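We can get the solution of the equations as follows:

$$\mu_t = \frac{R_t^{(i)}}{\sum_{t' \in T} R_{t'}^{(i)}} \tag{24}$$

$$\forall t_1, t_2 \in T, \quad r_{t_1,t_2} = \frac{R_{t_1}^{(i)} \sum_{o \in O} eqn(t_2, o)}{R_{t_1}^{(i)} \sum_{t \in T} \sum_{o \in O} eqn(t, o)} \tag{25}$$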

Example 4.1: Suppose that an observation set is {A, A, A, A, B, B, C, C}. We get the parameters of the model as $\theta = \{\mu; r\}$, where $\mu = \{\mu_A, \mu_B, \mu_C\}$ and $r = \{r_{AA}, r_{AB}, r_{AC}, r_{BA}, r_{BB}, r_{BC}, r_{CA}, r_{CB}, r_{CC}\}$. The parameter $\mu$ is initialized as {0.5, 0.25, 0.25} and $r$ is initialized as {0.6, 0.2, 0.2; 0.3, 0.4, 0.3; 0.4, 0.2, 0.4}. Then we use formulas (24) and (25) to iterate until the parameters converge. At last, we get the accuracy of the entity.
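The iteration in Example 4.1 can be sketched in a few lines of Python. This is only an illustration under an assumed reading of the update rules: $R_t = \mu_t\prod_{o\in O} r_{t,o}$, $\mu_t \leftarrow R_t/\sum_{t'} R_{t'}$, and $r_{t_1,t_2} \leftarrow$ the share of observations equal to $t_2$. The function name `em_category` and its parameters are hypothetical and do not come from the paper.

```python
from math import prod


def em_category(observations, values, mu, r, max_iter=100, tol=1e-9):
    """Iterate the ASSUMED closed-form updates until mu and r stop changing.

    mu[t]     : probability that t is the true value
    r[t1][t2] : probability that true value t1 is observed as t2
    """
    n = len(observations)
    counts = {v: sum(1 for o in observations if o == v) for v in values}
    for _ in range(max_iter):
        # E step: R_t = mu_t * prod_o r[t][o] (unnormalised weight of t).
        R = {t: mu[t] * prod(r[t][o] for o in observations) for t in values}
        total = sum(R.values())
        # M step (assumption): mu_t = R_t / sum(R), r[t1][t2] = count(t2) / |O|.
        new_mu = {t: R[t] / total for t in values}
        new_r = {t1: {t2: counts[t2] / n for t2 in values} for t1 in values}
        delta = max(abs(new_mu[t] - mu[t]) for t in values)
        delta = max(delta, max(abs(new_r[a][b] - r[a][b])
                               for a in values for b in values))
        mu, r = new_mu, new_r
        if delta < tol:
            break
    return mu, r


values = ["A", "B", "C"]
obs = ["A"] * 4 + ["B"] * 2 + ["C"] * 2
mu0 = {"A": 0.5, "B": 0.25, "C": 0.25}
r0 = {"A": {"A": 0.6, "B": 0.2, "C": 0.2},
      "B": {"A": 0.3, "B": 0.4, "C": 0.3},
      "C": {"A": 0.4, "B": 0.2, "C": 0.4}}
mu, r = em_category(obs, values, mu0, r0)
```

Under these assumed updates every row of `r` becomes the empirical distribution of the observed values after the first pass, so `mu` stops changing on the second pass and A ends up as the most probable true value.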

4.4 Implementation 

In this subsection, we introduce the implementation issues for 
the evaluation methods. 

Accuracy Evaluation for Measurable Data Type: To implement such evaluation, we first perform entity resolution with hashing [17]. Then ARE is computed for each entity according to Eq. (11). At last, the global ARE is computed based on Eq. (12). Thus, our evaluation method achieves O(n) time complexity.

Accuracy Evaluation for Comparable Data Type: To implement the evaluation for a comparable attribute a, we also first perform entity resolution on the data [17]. Then, for each entity e with all possible values $O = \{o_1, o_2, \ldots, o_n\}$ in the attribute a, we enumerate each $o_j \in O$ as the true value and compute $v_j = \frac{1}{|O|}\sum_{i=1}^{n}\mathit{Distance}(o_j, o_i)$ according to Eq. (14). After that, the $o_j$ leading to the minimal $v_j$ is selected as t, and ARE for e is computed according to Eq. (15). At last, ARE of the global dataset is computed according to Eq. (16).
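The selection of the true value for a comparable attribute can be sketched as follows. The sketch assumes that the distance is the plain edit (Levenshtein) distance and that $v_j$ is the mean distance from $o_j$ to all observed values; the helper names `edit_distance` and `pick_true_value` are hypothetical, and Eq. (15) is not reproduced here.

```python
def edit_distance(a, b):
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def pick_true_value(observations, distance=edit_distance):
    """Return (t, v_t): the observed value with the smallest mean distance
    to all observations (assumed form of v_j)."""
    best, best_v = None, float("inf")
    for cand in observations:
        v = sum(distance(cand, o) for o in observations) / len(observations)
        if v < best_v:
            best, best_v = cand, v
    return best, best_v
```

For the observations `["Harbin", "Harbin", "Harbn", "Habin"]`, the candidate "Harbin" has mean distance 0.5, which is lower than that of either misspelling, so it is chosen as t.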

Accuracy Evaluation for Category Data Type: According to Section 4.3, the evaluation is accomplished with the EM algorithm. Following the framework of the EM algorithm, random values are first assigned to the parameters $\mu$ and $r$. Then $\mu$ and $r$ are updated iteratively according to Eq. (24) and Eq. (25) until convergence. After convergence, with $\mu$ and $r$, the accuracy of a single entity e is computed as $ARE(e) = 1 - \sum_{t_1\in T}\sum_{t_2\in T}\mu_{t_1} r_{t_1,t_2}\,\mathit{diff}(t_1,t_2)$, and then the global accuracy is computed as $ARE(D) = \sum_{E_i\in D} |E_i|\,ARE(E_i)/|D|$.
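The two accuracy formulas for category data can be sketched as below. The sketch assumes that `diff` is a 0/1 mismatch indicator (the actual definition is given in Section 3) and that $|D|$ equals the total number of tuples over all entities; both function names are hypothetical.

```python
def entity_are(mu, r, diff):
    """ARE(e) = 1 - sum_{t1} sum_{t2} mu[t1] * r[t1][t2] * diff(t1, t2)."""
    return 1.0 - sum(mu[t1] * r[t1][t2] * diff(t1, t2)
                     for t1 in mu for t2 in r[t1])


def global_are(entities):
    """Size-weighted average of entity accuracies.

    entities: list of (size, are) pairs; |D| is ASSUMED to be the sum of sizes.
    """
    total = sum(size for size, _ in entities)
    return sum(size * are for size, are in entities) / total


# Assumption: diff is 1 when the two values differ and 0 otherwise.
mismatch = lambda t1, t2: 0.0 if t1 == t2 else 1.0
mu = {"A": 0.6, "B": 0.4}
r = {"A": {"A": 0.9, "B": 0.1}, "B": {"A": 0.2, "B": 0.8}}
are_e = entity_are(mu, r, mismatch)           # 1 - (0.6*0.1 + 0.4*0.2)
are_d = global_are([(4, 0.9), (6, 0.8)])      # (3.6 + 4.8) / 10
```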

4.5 Precision and Recall without true values 

Without true values, the precision and recall of a query are difficult to compute. In order to get the accuracy of a query, which represents how close it is to the real situation, we use the truth-finding methods discussed above to evaluate the precision and recall of the query results.

For measurable attribute types, we use the mean value $\bar{x}$ of the values which share the same true value to represent the true value; for comparable attribute types, we use the value t which minimizes the function F(O) denoted as formula (14) to represent the true value. For category attribute types, using the model in Section 4.3.1, we use the value t with the largest $\mu_t$ as the estimated true value. For category attribute types, we could also use maximum likelihood estimation to find the true value, that is, use the value that accounts for the largest proportion of all the values which share the same true value. We could also use the proposed relative accuracy computation method to assign weight factors to the tuple attributes in order to determine the true value for category attribute types.

With the estimated true value, we can use the formulas $TP_S/(TP_S+FP_S)$ and $TP_S/(TP_S+FN_S)$ proposed in Section 3 to compute the precision, recall and F-measure of the query.
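As a small illustration, the precision and recall formulas can be evaluated by comparing the set of tuples returned on the observed data with the set that would be returned on the estimated true values. The function name is hypothetical, and the F-measure is assumed here to be the usual harmonic mean of precision and recall.

```python
def precision_recall_f(result, truth):
    """Compute precision, recall and F-measure from two sets of tuple ids.

    result: ids returned by the query on the observed data
    truth : ids returned by the query on the estimated true values
    """
    result, truth = set(result), set(truth)
    tp = len(result & truth)   # returned and correct
    fp = len(result - truth)   # returned but not correct
    fn = len(truth - result)   # correct but not returned
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Assumption: F-measure is the harmonic mean of precision and recall.
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f
```

For example, with `result = {1, 2, 3, 4}` and `truth = {2, 3, 4, 5, 6}` there are 3 true positives, 1 false positive and 2 false negatives, giving precision 0.75 and recall 0.6.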

Our framework can also handle dynamic data updates. We discuss this in the next section, as well as how to improve accuracy evaluation using the relationships between the attributes.
Data Update and Functional Dependency

In Section 3 and Section 4, we proposed the accuracy estimation methods. They assume that the data set is static, but in practice the data set changes constantly. In this section, we discuss how to handle data updates. Since we find that the relationships between the attributes can be used to improve the accuracy evaluation, we also discuss them in this section.

5.1 Data update 

In order to adapt our framework to data updates and avoid recomputing the accuracy over the whole data set, we need methods to handle data updates. To facilitate the calculation, we record previously computed data. Two kinds of information need to be recorded: one is the accuracy of the attributes, and the other is the entity relationship between the tuples, that is, which tuples refer to the same real-world entity.

There are three kinds of data update operations: data modification, tuple insertion and tuple deletion. For a data update, we denote the entity before modification as $E_i$ and the entity after modification as $E_i'$. We denote their accuracies as $ARE(E_i)$ and $ARE(E_i')$, respectively. The accuracy of the data attribute before modification is denoted by $ARE(T)$ and that after modification by $ARE(T')$. We denote the data set as D. Since data modification does not change the size of the data set, we propose the following formula to update the accuracy of the attribute.



$$ARE(T') = ARE(T) + |E_i| \times \frac{ARE(E_i') - ARE(E_i)}{|D|} \qquad (26)$$



For tuple insertion and tuple deletion, we denote the size of the data set after the operation as $|D'|$. Then we propose the following formula for accuracy updating.

$$ARE(T') = \frac{|D|\,ARE(T) - |E_i|\,ARE(E_i) + |E_i'|\,ARE(E_i')}{|D'|} \qquad (27)$$



From formulas (26) and (27), we can see that if the data set is very large and the accuracy change is small, the accuracy of the attribute need not be updated immediately. Instead, we can update the accuracy after the number of changes reaches a constant threshold, which can be set manually. This facilitates the relative accuracy evaluation algorithm.
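The incremental update can be sketched as a single function. It assumes that $|D|$ counts tuples, so that $|D'| = |D| - |E_i| + |E_i'|$, and that the attribute accuracy is the size-weighted mean of the entity accuracies; the function name `update_are` is hypothetical.

```python
def update_are(are_t, d_size, old_size, old_are, new_size, new_are):
    """Update the attribute accuracy after one entity changes.

    Follows the form of formula (27); when old_size == new_size it reduces
    to formula (26). Assumption: |D'| = |D| - |E_i| + |E_i'|.
    """
    new_d_size = d_size - old_size + new_size
    return (d_size * are_t - old_size * old_are + new_size * new_are) / new_d_size


# Two entities of sizes 4 and 6 with accuracies 0.9 and 0.8: ARE(T) = 0.84.
are_t = (4 * 0.9 + 6 * 0.8) / 10
# Modification: the second entity improves from 0.8 to 0.9 (size unchanged).
after_modify = update_are(are_t, 10, 6, 0.8, 6, 0.9)
# Insertion: the second entity gains one tuple and its accuracy drops to 0.75.
after_insert = update_are(are_t, 10, 6, 0.8, 7, 0.75)
```

Both results agree with recomputing the weighted mean from scratch, which is the point of keeping the per-entity sizes and accuracies on record.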

5.2 Improving accuracy evaluation using functional 
dependency 

When we define the schema of a relational database, there are usually functional dependencies between attributes. A functional dependency is defined as follows. Given a relation S with attribute set $U(a_1, a_2, \ldots, a_n)$, let X and Y be subsets of U. If for any two tuples u and v of S, $u[X] = v[X]$ implies $u[Y] = v[Y]$, then we say that Y functionally depends on X, denoted as $X \rightarrow Y$. We can change the query plan using functional dependencies. For example, if $X \rightarrow Y$, a query on attribute Y can be converted into a query on attribute X. Based on this observation, we propose a method to accelerate accuracy evaluation.
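The definition above can be checked directly on a set of tuples. The following sketch represents tuples as dictionaries and tests whether $X \rightarrow Y$ holds; the function name `holds` and the sample attribute names are hypothetical.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds on rows.

    rows: list of dicts; lhs, rhs: lists of attribute names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # Two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(key, val) != val:
            return False
    return True


rows = [
    {"code": "0451", "place": "Harbin"},
    {"code": "0451", "place": "Harbin"},
    {"code": "010", "place": "Beijing"},
]
fd_ok = holds(rows, ["code"], ["place"])
fd_broken = holds(rows + [{"code": "010", "place": "Harbin"}], ["code"], ["place"])
```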

5.2.1 Accuracy range for global accuracy of data set. Functional dependencies exist in most databases, and a query plan can be rewritten and executed using only a part of the attribute set. We can therefore use a small attribute set to represent the whole attribute set. Based on this point, we can use the accuracy of a subset of the attributes to represent the accuracy of the whole data set.



We attempt to use functional dependencies between attributes to discover more information about the attributes, and mainly to find candidate keys. As we know from the knowledge of functional dependencies and closures, the closure of a candidate key X satisfies $X^{+} = U$, so we can represent the whole tuple using the candidate keys. For a query on an ordinary attribute, we can get a new query plan through functional dependencies and query rewriting. Hence we can determine the accuracy of a dataset using the accuracy of the candidate keys.

Candidate key discovery algorithms have been studied in [18] [19] and are not the focus of this paper. With candidate keys, we can filter out attributes which have low accuracy but can be deduced from the candidate keys. This can greatly improve the accuracy evaluation of the data set.

Suppose a table has two attributes, A and B, and attribute B depends on attribute A. All queries on attribute B can be transformed into queries on attribute A, and we can get the upper and lower bounds of the table's accuracy from the accuracies of A and B. If $ARE(A) > ARE(B)$, the accuracy of the table lies in the range $(ARE(B), ARE(A))$; if $ARE(A) < ARE(B)$, the accuracy of the table lies in the range $(ARE(A), ARE(B))$.

Usually, there is more than one candidate key in the relational schema. Assuming that the set of candidate keys is $\{X_1, X_2, \ldots, X_m\}$, our strategy is as follows. We first sort the attributes of the relation according to their accuracy computed before. We then find the attributes which are not candidate keys but whose accuracy is higher than the minimum accuracy of the candidate keys. They form the set $X_h$, and we can get the range to which the accuracy of the relation belongs. That is,

$$Accuracy(R) \in \big(\min(ARE(X_1 X_h), \ldots, ARE(X_m X_h)),\ \max(ARE(X_1 X_h), \ldots, ARE(X_m X_h))\big)$$



We use an example to illustrate the strategy. 

Example 5.1: The relational schema is R(A,B,C,D,E) and its functional dependencies are $F = \{A \rightarrow BC,\ CD \rightarrow E,\ B \rightarrow D,\ E \rightarrow A\}$. By the candidate key discovery algorithm, we can get the following closures:

$$A^{+} = ABCDE, \quad E^{+} = ABCDE, \quad (BC)^{+} = ABCDE, \quad (CD)^{+} = ABCDE$$

As shown above, the candidate keys of R are A, E, BC and CD. We compute {ARE(A), ARE(E), ARE(BC), ARE(CD)} and denote $ARE_{min} = \min(ARE(A), ARE(E), ARE(BC), ARE(CD))$ and $ARE_{max} = \max(ARE(A), ARE(E), ARE(BC), ARE(CD))$. We can then get the accuracy range of R as $Accuracy(R) \in (ARE_{min}, ARE_{max})$.
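The closures in Example 5.1 can be reproduced with a brute-force sketch. It is not one of the discovery algorithms of [18] [19]; it simply enumerates attribute subsets in order of size, which is adequate only for small schemas. The function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(schema, fds):
    """Minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            attrs = set(combo)
            if any(key <= attrs for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(attrs, fds) == set(schema):
                keys.append(attrs)
    return keys


fds = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]
keys = candidate_keys("ABCDE", fds)
key_names = sorted("".join(sorted(k)) for k in keys)
```

For the dependencies of Example 5.1 this yields the keys A, E, BC and CD, matching the closures listed above.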

5.2.2 Suggestions for improving query accuracy. Since a query plan can be rewritten using functional dependencies, we can use them to improve the relative accuracy of queries.

If we can find the mapping relation between attributes using a functional dependency, then we can apply it to improve the relative accuracy of a query. For example, suppose each place name corresponds to exactly one encoding; denoting the encoding as attribute X and the place name as attribute Y, the mapping can be denoted as $X \rightarrow Y$. If the accuracy of the attributes has already been computed and recorded, then for a query on attribute Y, if the accuracy of Y is higher than that of X, we can execute the query on Y directly; if the accuracy of X is higher than that of Y, then we can execute the query
Table 1. Main Notation.

Notation              Meaning
P-Actual              Actual precision
P-Evaluate            Estimated precision
R-Actual              Actual recall
R-Evaluate            Estimated recall
F-Actual              Actual F-measure
F-Evaluate            Estimated F-measure
G-Accuracy            The accuracy of the data source
Result Accuracy       The accuracy of the query result
Offline Evaluation    The possible accuracy of the query using the attribute accuracy calculated offline

doi:10.1371/journal.pone.0103853.t001



on X by mapping rules. To build the mapping rules, we can collect all the value pairs in which X and Y appear together and derive a mapping rule or table from them. In addition, we can also use the closure of the schema to find all the functional dependencies. With the recorded attribute accuracies and the functional dependencies between attributes, we can rewrite the query plan, thereby increasing the relative accuracy of the query.
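The rewriting idea can be sketched as follows, with X as an encoding and Y as a place name. The sketch assumes the mapping table is built by majority vote over co-occurring values, which is one possible way to make the mapping rule; the function names and attribute names are hypothetical.

```python
from collections import Counter, defaultdict


def build_mapping(rows, x, y):
    """Map each X value to the Y value it most often co-occurs with
    (majority vote is an assumption of this sketch)."""
    votes = defaultdict(Counter)
    for row in rows:
        votes[row[x]][row[y]] += 1
    return {xv: counter.most_common(1)[0][0] for xv, counter in votes.items()}


def select_by_y(rows, x, y, target, are):
    """Answer the selection Y = target on whichever attribute is more accurate."""
    if are[y] >= are[x]:
        return [row for row in rows if row[y] == target]
    mapping = build_mapping(rows, x, y)
    x_values = {xv for xv, yv in mapping.items() if yv == target}
    return [row for row in rows if row[x] in x_values]


rows = [
    {"code": "0451", "place": "Harbin"},
    {"code": "0451", "place": "Harbin"},
    {"code": "0451", "place": "Harbn"},   # erroneous place name
    {"code": "010", "place": "Beijing"},
]
via_x = select_by_y(rows, "code", "place", "Harbin", {"code": 0.95, "place": 0.7})
via_y = select_by_y(rows, "code", "place", "Harbin", {"code": 0.7, "place": 0.95})
```

When the encoding is the more accurate attribute, the rewritten query also returns the tuple with the misspelled place name, which the direct query on the place name misses.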

Experimental Results 

In order to evaluate the performance and efficiency of the relative accuracy evaluation, we carried out a series of experiments. In this section, we first describe the process through which we obtained the test data. We then carry out extensive experiments on basic queries and analyze their results. To the best of our knowledge, there are no publicly available systems which directly evaluate the relative accuracy of queries and the global accuracy of query results. Most query estimation algorithms focus on how to produce high-quality results relative to the query condition, but they do not generally consider the global accuracy of the result set. We care not only about the accuracy of the query, but also about the accuracy of the query results. Our experiments are conducted on a 3 GHz Intel(R) Core(TM) 2 Duo CPU with 4 GB main memory.



6.1 Test data 

Since there is no benchmark dataset available for evaluating the performance of our accuracy evaluation framework, we use the TPC-H toolkit to generate the test data, in order to obtain a representative test dataset for verifying the effectiveness of our framework in evaluating the precision, recall, F-measure and overall accuracy of query results. TPC-H is a toolkit provided by the Transaction Processing Performance Council (TPC); it is primarily used for OLAP testing and for estimating the performance of business analysis in decision support systems. In addition, it contains a complete set of business-oriented ad-hoc queries and concurrent data modifications.

First, we used the toolkit to generate the dataset. Since redundancy often exists in real-world databases, that is, there is usually more than one tuple describing one entity, we then treat each generated tuple as an entity and generate for it a set of tuples whose size is randomly selected from 1 to 10, and meanwhile manually add errors to the tuples in the set. With these synthetic labeled data, we use small data sets with 1K and 5K tuples to evaluate the queries and the overall performance; we test the performance of absolute accuracy evaluation on data sets of size 10K, 20K, 30K, 40K and 50K, respectively; we also use large datasets to perform the efficiency experiments, whose data sizes are 20K, 40K, 60K, 80K and 100K, respectively.
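The duplication step can be illustrated with a small generator. The error model below (a 20% chance per attribute, up to 10% numeric noise, one dropped character for strings) is purely an assumption made for illustration; the paper states only that errors were added manually. All names are hypothetical.

```python
import random


def corrupt(value, rng):
    """Inject one error into a value (hypothetical error model)."""
    if isinstance(value, (int, float)):
        return value * (1 + rng.uniform(-0.1, 0.1))   # numeric noise
    text = str(value)
    if len(text) < 2:
        return text + "?"
    i = rng.randrange(len(text))
    return text[:i] + text[i + 1:]                     # drop one character


def make_entity_cluster(entity, rng, error_rate=0.2, max_copies=10):
    """Turn one clean tuple into 1..max_copies tuples describing the same
    entity, corrupting each attribute with probability error_rate."""
    copies = []
    for _ in range(rng.randint(1, max_copies)):
        copy = dict(entity)
        for attr, value in entity.items():
            if rng.random() < error_rate:
                copy[attr] = corrupt(value, rng)
        copies.append(copy)
    return copies


rng = random.Random(0)
cluster = make_entity_cluster({"name": "Harbin", "price": 10.0}, rng)
```

Because the clean tuple is known for every cluster, the true accuracy can be computed alongside the estimated one.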

For performance experiments, the precision, recall, F-measure 
and the global accuracy of query results are used as our evaluation 




Figure 1. The experimental results of the comparison between the accuracy in the presence and absence of true values, denoted as True and Estimation, respectively. From the results, the estimated accuracy is slightly lower than the true accuracy.
doi:10.1371/journal.pone.0103853.g001
[Plot: accuracy (y-axis, 0.48 to 0.78) against data size (x-axis, 10K to 50K); series: True, Estimation.]

Figure 2. The experimental results of accuracy evaluation with functional dependencies. The range of the estimated accuracies and the true values are shown as lines and dots, respectively. From the results, the true accuracy is close to the upper bound of the range. (a) 1K Selection (b) 5K Selection.
doi:10.1371/journal.pone.0103853.g002



criteria. For efficiency experiments, we use the ratio of the evaluation time to the actual execution time as the evaluation metric. When the operation involves only the attributes of one dataset, the rough accuracy is compared with the F-measure of the query to show the performance of the query. For the other operations, it is compared with the accuracy of the query results. To facilitate the description of the experimental results, we first summarize the main notation used in the experimental part in Table 1.



6.2 The performance of absolute accuracy evaluation 

We use the small data sets to test the performance of global accuracy evaluation based on formula (2). The data sizes are 10K, 20K, 30K, 40K and 50K, respectively. The results are shown in Figure 1.

As we can see from Figure 1, the estimated accuracy is a little lower than the true accuracy, but the deviation is small. Since the data came from one test instrument, the results are similar.

In order to improve the accuracy evaluation, we take into account the functional dependencies between attributes, consider




[Bar chart panels: y-axis from 0.2 to 1.2; legend: P-Actual, P-Evaluate, R-Actual, R-Evaluate, F-Actual, F-Evaluate, G_Accuracy, Result G_Accuracy, Offline Evaluation.]



Figure 3. Experimental results for relative accuracy estimation of selection queries with different constraints, where we show P-Actual, P-Evaluate, R-Actual, R-Evaluate, G-Accuracy, Result Accuracy and Offline Evaluation with data sizes 1K and 5K. The meanings of these measures are shown in Table 1. (a) 1K Relation Union, Difference and Natural Join (b) 5K Relation Union, Difference and Natural Join. (a) 1K Attributes Union (b) 5K Attributes Union.
doi:10.1371/journal.pone.0103853.g003



PLOS ONE | www.plosone.org | August 2014 | Volume 9 | Issue 8 | e103853



[Bar chart: y-axis from 0.2 to 1.2; legend: P-Actual, P-Evaluate, R-Actual, R-Evaluate, F-Actual, F-Evaluate, G_Accuracy, Result G_Accuracy, Offline Evaluation.]



Figure 4. Experimental results for relative accuracy estimation of union queries with different sets, where we show P-Actual, P-Evaluate, R-Actual, R-Evaluate, G-Accuracy, Result Accuracy and Offline Evaluation with data sizes 1K and 5K. The meanings of these measures are shown in Table 1.
doi:10.1371/journal.pone.0103853.g004



only candidate keys and attributes with high accuracy, and remove the attributes with low accuracy. The results are shown in Figure 2.

As Figure 2 shows, the accuracy evaluation is given in the form of a range, and the method is effective since the low-accuracy attributes have been pruned. The true accuracy is close to the upper bound of the range.



6.3 The performance of relative accuracy evaluation 

As mentioned before, all queries can be defined and derived by 
selection, projection, union, difference and Cartesian product. We 
carry out experiments to test the performance of selection, 




[Bar chart panels for Relation Union, Relation Except and Natural Join: y-axis from 0.2 to 1.2; legend: P-Actual, P-Evaluate, R-Actual, R-Evaluate, F-Actual, F-Evaluate, G_Accuracy1, G_Accuracy2, Result G_Accuracy, Offline Evaluation.]



Figure 5. Experimental results for relative accuracy estimation of relational union, difference and natural join, where we show P-Actual, P-Evaluate, R-Actual, R-Evaluate, G-Accuracy 1, G-Accuracy 2, Result Accuracy and Offline Evaluation with data sizes 1K and 5K. The meanings of these measures are shown in Table 1, with G-Accuracy 1 and G-Accuracy 2 representing the accuracies of the two input relations.
doi:10.1371/journal.pone.0103853.g005
Figure 6. Experimental results on the scalability of accuracy estimation for selection queries with different constraints. The data size ranges from 20K to 100K and the unit of run time (y-axis) is second (s).
doi:10.1371/journal.pone.0103853.g006

attributes union, relations union, relation difference and natural 
join. 

6.3.1 Selection. For selection, we perform experiments on the three different attribute types independently. For the measurable type, the selection conditions include one-boundary and two-boundary conditions; for the comparable type, the selection conditions include only equality selection; for the category type, the situation is the same as for the comparable type. The results are shown in Figure 3.

As we can see from Figure 3, the precision, recall and F-measure for the comparable and category types are very close to the true values, and the error is within 10% of the exact evaluation. For the measurable type, we use the mean value to represent the true value of an entity, so a large error ratio sometimes appears when the query boundary is close to the true value and the attribute itself has low accuracy, but the error is within 15% of the exact evaluation. As selection is an operation between attributes, the offline estimation is slightly lower than the actual F-measure, but the error is within 15% of the actual value. In summary, our evaluation framework gives a good estimation for selection.

6.3.2 Union. We first carry out experiments on attribute union, which belongs to selection: $\sigma_F(R) = \{t \mid t\in R \wedge F(t) = \text{'true'}\}$, where $F(t) = f_1(t_1) \vee f_2(t_2) \vee \cdots \vee f_n(t_n)$. As there are three different attribute types, we tested all possible combinations of the three types. The results are shown in Figure 4.
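The attribute union above is a selection whose condition is a disjunction of per-attribute predicates. A minimal sketch, with hypothetical attribute names and predicates, is:

```python
def attribute_union(rows, predicates):
    """sigma_F(R) with F(t) = f_1(t_1) or f_2(t_2) or ... or f_n(t_n).

    predicates maps an attribute name to a boolean function of its value.
    """
    return [row for row in rows
            if any(f(row[attr]) for attr, f in predicates.items())]


rows = [
    {"price": 12.0, "city": "Harbin", "type": "A"},
    {"price": 3.0, "city": "Beijing", "type": "B"},
    {"price": 5.0, "city": "Harbin", "type": "C"},
]
selected = attribute_union(rows, {
    "price": lambda v: v > 10,     # measurable attribute: one boundary
    "type": lambda v: v == "B",    # category attribute: equality
})
```

Here the first tuple satisfies the condition on `price` and the second satisfies the condition on `type`, so both are returned while the third is not.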

As observed from these figures, the precision, recall and F-measure of attribute union are slightly lower than the true values, but the error is within 5% of the exact values. As it is an operation between attributes, the offline estimation is slightly higher than the actual F-measure. In practical applications, the offline estimation can be multiplied by a scaling factor less than 1 to improve its accuracy. In conclusion, our evaluation framework gives a good estimation for attribute union.

6.3.3 Relation Union. The relation union of datasets R and S finds the tuples which belong to R or S. The two datasets share some entities, but the errors added to the two datasets are independent. The results are shown in Figure 5.

As observed from these figures, the precision, recall and F-measure are slightly lower than the true values, but the error is within 5% of the exact values. As it is an operation between sets, the offline estimation is slightly lower than the global accuracy of the result,


[Line chart: run time (y-axis) against data size (x-axis, 20K to 100K); legend entries recovered: MVComVCat, MVCom, MVCat.]

Figure 7. Experimental results on the scalability of accuracy estimation for attribute union queries with different sets. The data size ranges from 20K to 100K and the unit of run time (y-axis) is second (s).
doi:10.1371/journal.pone.0103853.g007

but the error is within 10% of the estimation accuracy. To sum up, 
our evaluation framework could give a good estimation for relation 
union. 

6.3.4 Relation Difference. The relation difference of datasets R and S finds the tuples which belong to R but not to S. For relation difference, the data sets are the same as for relation union. The results are shown in Figure 5.

From these figures, the precision, recall and F-measure fluctuate around the true values, but the error is within 5% of the exact values. As it is an operation between sets, the offline estimation is slightly lower than the global accuracy of the result, but the error is within 10% of the estimation accuracy. In summary, our evaluation framework gives a good estimation for difference.

6.3.5 Natural Join. For join, we only perform experiments on natural join; other joins behave similarly. The attribute used for the join is a comparable attribute. The results are shown in Figure 5.

From the experimental results, the precision, recall and F-measure are slightly lower than the true values, but the error is within 5% of the exact values. As it is an operation between sets, the offline estimation is slightly lower than the global accuracy of the result, but the error is within 5% of the estimation accuracy. In conclusion, our evaluation framework gives a good estimation for join.

6.4 The efficiency of relative accuracy evaluation 

In order to test the efficiency of our framework, we run experiments on different data sets with sizes 20K, 40K, 60K, 80K and 100K, respectively. We use the ratio of the evaluation time to the actual execution time as the evaluation metric and perform experiments on selection, attribute union, relation union, relation difference and natural join. The results are shown in Figure 6, Figure 7 and Figure 8, respectively.

From these figures, as the amount of data grows, for single-attribute selection the ratio for measurable attributes stabilizes around 2, the ratio for comparable attributes stabilizes around 3.5, and the ratio for category attributes stabilizes around 1.3. The computation for comparable attributes involves computing the edit distance, so it takes longer. For attribute union, the ratio also stabilizes at a constant as the amount of data increases; for relation union and relation difference, the ratio stabilizes around 1.2; for natural join, the ratio stabilizes around 5, mainly because the number of attributes of the result
Figure 8. Experimental results on the scalability of accuracy estimation for relation union, except queries and natural join. The data size ranges from 20K to 100K and the unit of run time (y-axis) is second (s).
doi:10.1371/journal.pone.0103853.g008

set is larger than that of the input relations. Since our framework evaluates not only the precision and recall of the query but also the global accuracy of the query result, the ratio is larger than 1. In summary, as the amount of data increases, our estimation framework achieves linear time.

Conclusion: We carried out extensive performance and efficiency experiments on selection, attribute union, relation union, relation difference and natural join. For those queries, our evaluation methods give accuracy estimates which are very close to the true accuracy, and for large amounts of data, our algorithm achieves linear time.

Related Work 

There are two classes of work related to our research: truth discovery and query evaluation. There are several studies related to truth discovery. Resolving inconsistency [21] and modeling source quality [22] have been discussed in the context of data integration. Later, [14] was the first to formally introduce the truth-finding problem. Then [23] developed several new algorithms and applied integer programming to enforce constraints on truth data; [24] designed a framework that can incorporate background information; [25] proposed an EM algorithm for truth finding in sensor networks. The copying relationship between sources was
studied in [15]. However, we consider truth discovery from the perspective of entity recognition technology, which differs from the previous works.

For query evaluation, many studies have focused on providing approximate answers to queries, but these techniques approximate query results based only upon a subset of the data. In [26], Vrbsky et al. studied how to provide approximate answers to set-valued queries. Other techniques use pre-computation [27], sampling [28] and synopses [29] to produce statistical results. Koch and Gotz [30] study the reliability of query results, but their goal is to provide a compositional framework for queries over unreliable data resulting from approximate query processing; Perez et al. study the evaluation of probabilistic threshold queries in MCDB [31]. Unlike the previous work, our paper considers not only the relative accuracy of the query, but also the overall accuracy of the query results.

Conclusions 

In this paper, we study the quality of queries and design a relative accuracy evaluation framework for multi-modal data. Within this framework, we classify data types into three categories and develop accuracy evaluation algorithms for each category in both the presence and the absence of true values. We present a novel metric, ARE, for measuring the accuracy of one entity in a statistical way, and also show methods to evaluate the precision and recall of the basic queries, which are combined with the absolute accuracy of query results to show the result's relative accuracy. Our framework could be easily extended to big data, as we use entity resolution technology as the foundation. We also propose methods to handle data updates and to improve accuracy evaluation using functional dependencies. Extensive experimental results show the effectiveness and efficiency of our proposed framework.

As future work, we plan to combine the quality and copy relationships of data sources to improve the effectiveness of our framework.